1. Introduction


Statistical techniques as a rule require that some sort
of randomization has been used in obtaining data. It is
generally assumed that the design of the experiment or survey
that produced the data used some available method of
randomization. Most statistical procedures have been
developed on the assumption that the available data arose
from some use of the principle of randomness.

However,  in  statistical  practice,  very  little  attention  is 
paid  to  confirming  this  basic  assumption  of  randomness  in 
data analysis. Grave errors are possible when a given set of
data is assumed to be random although it may have arisen under
a known or an unknown type of bias.

In  simulation  studies,  extensive  use  is  made  of  random 
numbers  generated  from  computers.  Many  theoretical  studies 
in  modern  science  depend  heavily  on  the  generation  of  random 
numbers or pseudo-random numbers. In present-day state
lotteries and daily number games, random numbers
are produced in the millions. Highly sophisticated routines
have been developed to test the generation of such numbers.
Users, as a rule, do not test the randomness of numbers
generated by well-known computer packages, even though such
random numbers are put to important uses in real problems.

There are several situations in problems of sciences
and industry where data are generated in series. Examples
of time series data abound in problems in economics, geology,
medicine, climatology, energy, etc., and judicious application
of statistical time series requires answers to questions of
randomness.

Randomness implies different meanings in different contexts.
Usually a random sample from a population means that
we have observed a set of identically and independently
distributed random variables from a population with a specified
probability model. Sir Ronald A. Fisher made a major contribution
to scientific experimentation by introducing the concept
of randomization at the stage of design. Fisher developed
tests for the sample resulting from the randomization process and
recommended alternative strategies.

When randomization leads to a bad-looking experiment or
sample, Fisher said that the experimenter should, with discretion
and judgement, put the sample aside and draw another (Savage,
1976).

The importance of testing for randomness in statistical
practice is not recognized in spite of warnings from statisticians.
In a recent paper, Federer (1978) noted that "many statisticians
and teachers of statistics, assume, but do not verify, that
they have a random sample from a prescribed population."

The literature on randomness is extensive and scattered
over many disciplines. The list of references includes many
such studies. In this paper a brief account is given of
commonly used tests of randomness for samples,
for generated random numbers, for time series and for discrete
random events. An example from coal mining disasters in
England, recently discussed by Jarrett (1979), is studied
in detail and the data are subjected to analysis. Random
numbers obtained from lottery games in Ohio are used to
illustrate some other tests.

2. Hypothesis of Randomness

A common formulation of the hypothesis of randomness
is in terms of identically and independently distributed
observations from an unknown distribution function. Given
that $X_1, X_2, \ldots, X_n$ is a sample from a population having
the cumulative distribution function $F(x)$, the hypothesis
of randomness is stated in the following manner.

$H_0$: $X_1, X_2, \ldots, X_n$ are independently and identically
distributed with probability distribution function $F(x)$,

versus the alternative hypothesis that they are not. For convenience,
a further assumption is made about the continuity of $F(x)$ so
that its density function $f(x)$ exists. The test statistic is
highly dependent on the form of the alternative. Suppose the
alternative is

$K$: $X_1, X_2, \ldots, X_n$ are distributed independently
with distributions $F_1(x), F_2(x), \ldots, F_n(x)$
respectively.

The most powerful test of $H_0$ in the Neyman-Pearson setup
is a test based on the sample ranks such that it provides
a critical region of given size $\alpha$. Except in simple cases,
the distribution theory of the test statistic cannot be
obtained explicitly and the test is of limited use.

Theorem 6.A of Hajek (1969), which is essentially a
reformulation of the Neyman-Pearson lemma, provides the main
result.

There are, however, many useful nonparametric tests
which test the hypothesis of randomness against alternatives
which are in terms of two samples. Suppose now that the
alternative is given by

$H_2$: $X_1, X_2, \ldots, X_m$ are independently and identically
distributed as $F_1(x)$ and $X_{m+1}, \ldots, X_n$ are
distributed independently and identically as $F_2(x)$.

A special case of the alternative hypothesis is obtained
if $F_1$ and $F_2$ differ in location or scale. Several of these
situations are discussed by Hajek, resulting in many
classical nonparametric tests.

3. Tests for Random Numbers

There is a large variety of tests available for
testing whether a given set of digits is independently and
uniformly distributed. This is what is ordinarily meant
by the word "random" in the generation of random numbers.
Knuth (1968) has provided a long list of such tests.

Usually a generator is not regarded as a good random number
generator unless it passes at least half a dozen tests of
randomness. The reason is that "randomness" has various
"attributes". The alternatives to the null hypothesis that
the observations are independently and identically uniformly
distributed are too numerous to be covered by a single test.
We shall formulate these alternative hypotheses in the following
and give the tests which are commonly used in practice.

Many of the following tests are found in Knuth (1968).

(i) Chi-squared test

$H_0$: The digits are distributed independently and identically
with equal probability of being 0, 1, ..., 9.

$H_1$: They are not.

The usual statistic

$$\chi^2 = \sum_{i=0}^{9} \frac{(O_i - E_i)^2}{E_i}$$

counts the observed numbers $O_i$ of digits 0, 1, 2, ..., 9 in the
sample and compares these frequencies with the expected
frequencies $E_i$ for the sample. For a large sample of digits,
this statistic has approximately a $\chi^2$-distribution. Power
studies of this test are available in the literature.
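As an illustration of the frequency test just described, the following minimal Python sketch computes the statistic for a list of decimal digits. The function name `digit_chi_squared` and the two sample sequences are hypothetical and are not taken from the paper; the critical values are the standard tabulated upper points of chi-squared with 9 degrees of freedom.

```python
from collections import Counter

# Standard tabulated upper points of chi-squared with 9 degrees of freedom.
CHI2_9DF_UPPER_10_PERCENT = 14.68
CHI2_9DF_UPPER_5_PERCENT = 16.92


def digit_chi_squared(digits):
    """Chi-squared statistic for H0: the digits 0..9 are equally likely."""
    n = len(digits)
    expected = n / 10.0  # same expected count for every digit under H0
    observed = Counter(digits)
    return sum((observed.get(d, 0) - expected) ** 2 / expected
               for d in range(10))


# Hypothetical samples of 300 digits each (illustration only).
balanced = list(range(10)) * 30
skewed = [0] * 120 + [d for d in range(1, 10) for _ in range(20)]

for name, sample in (("balanced", balanced), ("skewed", skewed)):
    stat = digit_chi_squared(sample)
    decision = "reject" if stat > CHI2_9DF_UPPER_5_PERCENT else "accept"
    print(name, round(stat, 2), decision)
```

The statistic looks only at the digit counts, so a sequence such as 0, 1, ..., 9 repeated many times passes this test perfectly. This is one reason a battery of tests is applied to a generator.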

(ii) Equidistribution test. This can be performed by using the
Kolmogorov-Smirnov statistic to test uniformity of the
real-valued sequence so generated. The discrete form of the
Kolmogorov-Smirnov test is applied if the numbers are rounded
off to integers.

(iii) Serial test. It uses the Chi-squared statistic to compare
adjacent pairs or triples of numbers. Good (1957) has
discussed the distribution of the Chi-squared statistic if both
the equidistribution test and the serial test are used on the
same data.

(iv) Gap test. Two numbers, $\alpha$ and $\beta$, are chosen, and the
lengths of the gaps between successive occurrences of numbers
lying between $\alpha$ and $\beta$ are used as the statistic. If
$p = \beta - \alpha$, the test of goodness-of-fit is
used to compare the observed values with the expected
values $p$, $p(1-p)$, $p(1-p)^2$, ..., $p(1-p)^t$, when the number of
possible runs is $t+1$.

(v) Poker test. Using any set of five successive integers,
the frequencies of various combinations, as found in the game
of poker, are used to test goodness-of-fit. Under the
hypothesis of randomness these probabilities are given
below.

Bust         (a b c d e)  =  0.3024
Pair         (a a b c d)  =  0.5040
2 pairs      (a a b b c)  =  0.1080
3 of a kind  (a a a b c)  =  0.0720
Full house   (a a a b b)  =  0.0090
4 of a kind  (a a a a b)  =  0.0045
5 of a kind  (a a a a a)  =  0.0001
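The poker-test probabilities listed above can be checked by direct enumeration of all $10^5$ equally likely hands of five decimal digits. The sketch below is illustrative only; the function names are hypothetical and the classification by repetition pattern is an assumption about how the categories are defined.

```python
from collections import Counter
from itertools import product

# Repetition pattern (sorted multiplicities) -> poker category.
CATEGORY = {
    (1, 1, 1, 1, 1): "bust",
    (2, 1, 1, 1): "pair",
    (2, 2, 1): "two pairs",
    (3, 1, 1): "three of a kind",
    (3, 2): "full house",
    (4, 1): "four of a kind",
    (5,): "five of a kind",
}


def poker_category(hand):
    """Classify a hand of five digits by its repetition pattern."""
    pattern = tuple(sorted(Counter(hand).values(), reverse=True))
    return CATEGORY[pattern]


def poker_probabilities():
    """Exact category probabilities for five independent uniform digits."""
    counts = Counter(poker_category(hand)
                     for hand in product(range(10), repeat=5))
    total = 10 ** 5
    return {name: count / total for name, count in counts.items()}


probabilities = poker_probabilities()
for name in CATEGORY.values():
    print(f"{name:16s} {probabilities[name]:.4f}")
```

In an actual test these probabilities, multiplied by the number of hands examined, would supply the expected frequencies for a Chi-squared goodness-of-fit statistic.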

(vi) Coupon collector's test. The lengths of the sequences
required to obtain a complete set of the integers 0, 1, 2, ..., d-1
are observed. If $p_r$ is the probability that $r$ digits are needed,
then $p_r = 0$ for $r = 0, 1, 2, \ldots, d-1$, and we have

$$p_r = \frac{1}{d^{\,r-1}} \sum_{v=0}^{d-1} (-1)^v \binom{d-1}{v} (d-1-v)^{r-1},
\qquad r = d, d+1, \ldots$$
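A short sketch of how the coupon collector probabilities might be evaluated exactly is given below. It assumes the alternating-sum form $p_r = d^{-(r-1)} \sum_{v=0}^{d-1} (-1)^v \binom{d-1}{v} (d-1-v)^{r-1}$ for $r \ge d$; the function name `coupon_probability` is hypothetical.

```python
from fractions import Fraction
from math import comb


def coupon_probability(r, d):
    """Probability that exactly r draws are needed to see all d symbols."""
    if r < d:
        return Fraction(0)  # fewer than d draws cannot cover d symbols
    total = sum((-1) ** v * comb(d - 1, v) * (d - 1 - v) ** (r - 1)
                for v in range(d))
    return Fraction(total, d ** (r - 1))


# Decimal digits: the shortest complete sequence has length 10.
print(float(coupon_probability(10, 10)))  # equals 10!/10**10
print(float(sum(coupon_probability(r, 10) for r in range(10, 60))))
```

Exact rational arithmetic is used because the alternating sum involves large cancelling terms. In practice the tail beyond some length would be pooled into a single class before the goodness-of-fit comparison.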

(vii) Permutation test. The sequence is divided into n
groups of t elements. The number of times each ordering
appears is counted. A Chi-squared test is utilized, since
the probability of a given ordering is 1/t!.

(viii) Runs test. The statistic used is the number of runs
up or down. A run is a maximal increasing (or decreasing)
subsequence of the integers, and its length is recorded. The
distribution of runs of various lengths is well known and is
given in recent textbooks.

(ix) Maximum of t. For the given sequence of random
numbers $U_0, U_1, U_2, \ldots$, consider subsequences of length $t$.
Let

$$V_j = \max(U_j, U_{j+1}, \ldots, U_{j+t-1})$$

for $j = 0, 1, \ldots, n-1$. Using $V_0, \ldots, V_{n-1}$ as the
observations from $F(x) = x^t$, $0 \le x \le 1$, we
have the usual Kolmogorov-Smirnov statistic for testing
randomness.

(x) Serial correlation test. The test uses the
statistic which computes the serial correlation for the
generated numbers. If the hypothesis of randomness is
to be rejected, the sample serial correlation should be
large.

4. Randomness of Series of Events

In  a  given  series  of  events  such  as  coal  mining 
accidents,  computer  failures, occurrence  of  prizes  in  a 
lottery  or  arrival  of  a  cancer  patient  at  a  clinic, 
interest  may  center  around  the  randomness  of  these  events. 
Cox  and  Lewis  (1966)  have  given  several  applications  where 
problems  of  randomness  of  series  of  events  are  discussed. 

We  consider  first  the  case  of  binary  events. 

In a recent paper, Larsen et al. (1973) have studied
the hypothesis of randomness of binary events with the
alternative of unimodal clustering. Assume that a sequence
of $n$ Bernoulli trials with $m$ successes has the order of the
$i$th success given by $y_i$, with

$$1 \le y_1 < y_2 < \cdots < y_m \le n,$$

where the $y_i$ are integers between 1 and $n$. Let $m = 2r$ if $m$
is even and $m = 2r + 1$ if $m$ is odd. Then

$$K_1 = \sum_{i=1}^{m} |y_i - y_{r+1}|$$

is used to study unimodal clustering. The hypothesis of
randomness is rejected when $K_1$ is small. When the data
consist of $s$ such sequences, the statistic $K_s$ can be
formed by summing $K_1$ for each sequence.
It is shown by Larsen et al. that

$$E(K_1) = \frac{(n+1)\left[\frac{m}{2}\right]\left[\frac{m+1}{2}\right]}{m+1},$$

$$V(K_1) = \frac{r(n+1)(n-m)\{(m+1)^2 - 2r^2 - \delta(m)\}}
{12\left\{2\left[\frac{m+1}{2}\right] + 1\right\}^2},$$

where $[x]$ = largest integer not exceeding $x$ and

$\delta(m) = r - 2$ when $m$ is odd,
$\delta(m) = 2r - 1$ when $m$ is even.

Approximate tests are constructed using asymptotic
theory for $K_s$.
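The moment formulas for the clustering statistic can be checked numerically on small cases. The sketch below assumes that, under randomness and given $m$, all $\binom{n}{m}$ placements of the successes are equally likely, and that the statistic is $K_1 = \sum_i |y_i - y_{r+1}|$ with mean $(n+1)[m/2][(m+1)/2]/(m+1)$ and variance $r(n+1)(n-m)\{(m+1)^2 - 2r^2 - \delta(m)\}/(12\{2[(m+1)/2]+1\}^2)$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def k1(positions):
    """K_1 = sum of |y_i - y_(r+1)| over the ordered success positions."""
    ys = sorted(positions)
    centre = ys[len(ys) // 2]  # y_(r+1) in 1-based indexing
    return sum(abs(y - centre) for y in ys)


def k1_moments_formula(n, m):
    """Mean and variance of K_1 from the closed-form expressions."""
    r = m // 2
    delta = r - 2 if m % 2 == 1 else 2 * r - 1
    mean = Fraction((n + 1) * (m // 2) * ((m + 1) // 2), m + 1)
    var = Fraction(r * (n + 1) * (n - m) * ((m + 1) ** 2 - 2 * r * r - delta),
                   12 * (2 * ((m + 1) // 2) + 1) ** 2)
    return mean, var


def k1_moments_enumerated(n, m):
    """Mean and variance of K_1 by enumerating all C(n, m) placements."""
    values = [k1(c) for c in combinations(range(1, n + 1), m)]
    mean = Fraction(sum(values), len(values))
    var = Fraction(sum(v * v for v in values), len(values)) - mean ** 2
    return mean, var


for n, m in [(7, 2), (8, 3), (9, 4), (10, 5)]:
    print(n, m, k1_moments_formula(n, m), k1_moments_enumerated(n, m))
```

Enumeration is feasible only for small $n$. For realistic sequence lengths the normal approximation based on these two moments would be used instead.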

Other alternatives to randomness are considered by
several authors. For example, O'Brien (1976) studied the
alternative of multiple clustering to randomness in the
case of binary data. It is assumed that in the $N$ trials
observed, $m$ are successes and $N - m = n$ are failures,
such that $m \ge n$.

Let $y_i$ be the number of successes prior to the $i$th failure
but subsequent to the $(i-1)$th failure. Let $\bar{y}$ be the average
length,

$$\bar{y} = \frac{1}{n+1} \sum_{i=1}^{n+1} y_i .$$

Let

$$s^2 = \frac{1}{n} \sum_{i=1}^{n+1} (y_i - \bar{y})^2 .$$

The distribution of $s^2/m^2$ has been tabulated by Dixon (1940) for
$m, n < 10$. Dixon also showed that for $m, n > 10$, the distribution
of $cs^2$ is approximately chi-squared with $\nu$ degrees of freedom,
where $c$ and $\nu$ are constants depending on $m$ and $n$.

The hypothesis of randomness is
rejected when $s^2$ is large or small.

A test of randomness using the coefficient of variation
(CV), where $X_1, X_2, \ldots, X_n$ are independently and identically
exponentially distributed, was studied by Moran (1951). He
showed that $k(\mathrm{CV})^2$ has a $\chi^2_\nu$ distribution, where

$k = n/2$,
$\nu = n/2$,

approximately. Asymptotically the distribution of $(\mathrm{CV})^2$
is normal. O'Brien (1976) has given a comparison of the
actual Chi-squared statistic and the observed value based
on simulation. This approximation does not seem very good
except for upper percentile points.

5. Tests of Randomness in Spatial Situations

There  are  several  situations  in  which  random  phenomena 
occur  in  space.  For  example,  one  may  be  interested  in  the 
distribution  of  points  on  a  line  or  in  plane.  Related 
problems  occur  in  tests  using  geometrical  methods  such 
as  tests  of  certain  hypotheses  utilizing  graphical  techniques. 
In  the  case  of  multivariate  data,  several  procedures  have 
been  found  to  be  useful  for  preliminary  study  of  data. 

Some of the early tests of randomness for points on a
line can be transformed to the test of the hypothesis of
independent and identically distributed uniform random
variables. Pearson (1963) has given comparisons of four
tests based on Kolmogorov-Smirnov and von Mises statistics
and their standardized forms.

In two dimensions, several models have been discussed
by authors to derive tests for the randomness of points in
a plane.

Brown and Rothery (1978) have discussed the hypothesis
of randomness of points in a plane formulated as the points
forming a two-dimensional Poisson point process. They have
proposed two statistics which are sensitive to the alternative
hypothesis of local regularity. One is the squared coefficient
of variation of squared nearest neighbor distances and the
other is the ratio of the geometric mean to the arithmetic
mean of the squared distances. The distributions of these
statistics are obtained with the help of computer simulations
and numerical approximations.

Let $n$ points be distributed in a given region. Suppose
$v_1, v_2, \ldots, v_n$ are the squared distances to the nearest
neighbor of randomly selected individuals, with mean $\bar{v}$.

Let

$$S = \frac{\sum_{i=1}^{n} (v_i - \bar{v})^2}{(n-1)\,\bar{v}^2}$$

and

$$G = \frac{\left(\prod_{i=1}^{n} v_i\right)^{1/n}}{\bar{v}} .$$

Tests based on $S$ and $G$ reject the hypothesis of randomness
if $S$ is small or $G$ is large.
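The two nearest-neighbor statistics can be sketched in a few lines of Python. The version below takes $S$ as the squared coefficient of variation of the squared nearest-neighbor distances and $G$ as the ratio of their geometric mean to their arithmetic mean, as described in the text. It uses every point in the pattern rather than a random subsample of individuals, which is a simplifying assumption, and the function names are hypothetical.

```python
import math


def squared_nn_distances(points):
    """Squared distance from each point to its nearest neighbor."""
    result = []
    for i, (xi, yi) in enumerate(points):
        result.append(min((xi - xj) ** 2 + (yi - yj) ** 2
                          for j, (xj, yj) in enumerate(points) if j != i))
    return result


def s_and_g(v):
    """Return (S, G) for a list of positive squared distances v."""
    n = len(v)
    mean = sum(v) / n
    s = sum((x - mean) ** 2 for x in v) / ((n - 1) * mean ** 2)
    geometric_mean = math.exp(sum(math.log(x) for x in v) / n)
    return s, geometric_mean / mean


# A perfectly regular 5 x 5 lattice: the extreme case of local regularity.
lattice = [(i, j) for i in range(5) for j in range(5)]
print(s_and_g(squared_nn_distances(lattice)))
```

For the lattice every squared distance equals 1, so $S$ is 0 and $G$ is 1, the extreme values in the direction in which the tests reject. Critical values for finite regions would still have to come from simulation.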

In the case of points in an infinite plane, $-2 \log G$
is a special case of Bartlett's statistic and has been
extensively studied; see, for example, Glaser (1976).

However, for finite planes of various shapes and sizes,
simulations have to be used. Brown and Rothery (1978) have
obtained the values for estimated probabilities in the
upper tails of the distribution of $G$ and lower tails of
the distribution of $S$ based on 1500 realizations for the
circle, 2000 realizations for the square and 1200 realizations
for various rectangle sizes, as given in Tables I and II, for
sample sizes of 25 and 36. Recently a survey of tests
of randomness for spatial point patterns has been given by
Ripley (1979). The asymptotic distribution theory and power
of tests based on the nearest-neighbor distances and
estimates of the variance function are investigated in
this study.

6. Tests of Randomness for Time Series Data

When the data in an experiment arise in a sequence,
the natural question arises about dependence of observations.
The usual alternatives to randomness in time series are
those of trend and periodicity. Kendall and Stuart (1968)
give several tests for randomness in time series. Commonly
used tests are the turning points test, run test, rank
correlation test and difference-sign test. For example,
the turning points test is performed as follows. Let
$u_1, u_2, \ldots, u_n$ be the time series.

Let $p$ be the number of turning points of this series.
It can be shown that the mean and variance of $p$ are given by

$$E(p) = \frac{2(n-2)}{3}$$

and

$$V(p) = \frac{16n - 29}{90}$$

under the assumption of randomness. The test is usually
performed using the normal approximation. The rank correlation
test is performed as follows.

Let $P$ be the number of pairs where $u_j > u_i$, $j > i$, for
$i, j = 1, 2, \ldots, n$. Then $E(P) = n(n-1)/4$. If $Q$ is the
number of pairs such that $u_j < u_i$, $j > i$, then Kendall's $\tau$ is

$$\tau = 1 - \frac{4Q}{n(n-1)} .$$

It is well known that

$$E(\tau) = 0$$

and

$$V(\tau) = \frac{2(2n+5)}{9n(n-1)},$$

and $\tau$ has approximately a normal distribution, which is then
used to test the hypothesis of randomness. The details
of these tests are available in Kendall and Stuart (1976).
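Both time series tests reduce to a count followed by a normal approximation, so they are easy to sketch. The Python functions below are a hypothetical illustration that assumes a series without tied values and uses the standard moments $E(p) = 2(n-2)/3$, $V(p) = (16n-29)/90$, $\tau = 1 - 4Q/(n(n-1))$ and $V(\tau) = 2(2n+5)/(9n(n-1))$.

```python
import math


def turning_points_test(u):
    """Return (p, z): number of turning points and its normal deviate."""
    n = len(u)
    # A turning point is an interior value that is a local peak or trough.
    p = sum(1 for i in range(1, n - 1)
            if (u[i] - u[i - 1]) * (u[i + 1] - u[i]) < 0)
    mean = 2.0 * (n - 2) / 3.0
    variance = (16.0 * n - 29.0) / 90.0
    return p, (p - mean) / math.sqrt(variance)


def rank_correlation_test(u):
    """Return (tau, z): Kendall's tau against time order and its deviate."""
    n = len(u)
    # Q counts pairs (i, j), j > i, in which the later value is smaller.
    q = sum(1 for i in range(n) for j in range(i + 1, n) if u[j] < u[i])
    tau = 1.0 - 4.0 * q / (n * (n - 1))
    variance = 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))
    return tau, tau / math.sqrt(variance)


series = [1, 3, 2, 4, 3, 5]  # hypothetical short series
print(turning_points_test(series))
print(rank_correlation_test(series))
```

A large positive or negative deviate from the rank correlation test points toward trend, while too many or too few turning points suggests rapid oscillation or positive serial dependence respectively.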


7. Randomness of Treatment Allocation in Experiments


To verify the claim that the treatments were assigned
at random in an experiment, randomization tests or
permutation tests are often utilized. To perform the
permutation test, one can use the fact that the conditional
distribution of any arrangement of the observations, given the
values of the order statistics, is uniform.

In such a case, an appropriate test statistic is chosen and
the value of this statistic is calculated for each arrangement.
The hypothesis tested is that there is no difference among
the treatments assigned at random to experimental units. The
hypothesis is rejected for the $\alpha n!$ arrangements which give the
most extreme test statistic. Here $\alpha$ is the level of the test
and $n$ is the number of observations. Since the critical
region depends on the sample values of the order statistics,
unlike the usual case, the critical region can only be obtained
after the sample has been observed. Since such computations
are laborious, these tests are not usually carried out. Details
of these tests are given by Hajek (1969).
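For two treatments the permutation argument can be illustrated with an exact enumeration. The sketch below is not the procedure of any cited author; it assumes the difference in group means as the test statistic and enumerates allocations of units to the first group, which is equivalent to enumerating the $n!$ arrangements up to reorderings within groups. The function name and the data are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n, k = len(pooled), len(group_a)
    total = sum(pooled)
    observed = abs(sum(group_a) / k - sum(group_b) / (n - k))
    extreme = 0
    arrangements = 0
    for chosen in combinations(range(n), k):
        sum_a = sum(pooled[i] for i in chosen)
        difference = abs(sum_a / k - (total - sum_a) / (n - k))
        arrangements += 1
        # Small tolerance guards against floating-point ties.
        if difference >= observed - 1e-12:
            extreme += 1
    return extreme / arrangements


# Hypothetical responses under two treatments.
print(permutation_p_value([10, 11, 12], [1, 2, 3]))  # 2 of 20 allocations
```

The hypothesis is rejected at level $\alpha$ when the returned proportion does not exceed $\alpha$. The number of allocations grows very quickly with the sample size, which is the computational burden noted in the text.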

8. Miscellaneous Tests

(i) Test of Randomness of Several Rankings

When several judges rank the same items, one is interested
in the test of randomness of these rankings. Usually this can
be tested by Friedman's statistic. Let there be $I$ items and let
$R_i$ be the sum of the $m$ ranks assigned to item $i$, $i = 1, 2, \ldots, I$.
Friedman's statistic is given by

$$\chi_r^2 = \frac{12}{mI(I+1)} \sum_{i=1}^{I} R_i^2 - 3m(I+1).$$

For large $m$ and $I$, Friedman's statistic is distributed
asymptotically as Chi-squared with $I-1$ degrees of freedom.
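As a small worked illustration of Friedman's statistic $\chi_r^2 = \frac{12}{mI(I+1)} \sum R_i^2 - 3m(I+1)$, the sketch below computes it from a list of rankings, one per judge. The function name and the rankings are hypothetical, and ranks are assumed to be untied permutations of 1, ..., I.

```python
def friedman_statistic(rankings):
    """Friedman's statistic for m judges each ranking the same I items."""
    m = len(rankings)        # number of judges
    items = len(rankings[0])  # number of items, I
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(items)]
    return (12.0 / (m * items * (items + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * m * (items + 1))


# Three judges in complete agreement on four items.
agree = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
# Three judges whose rankings cancel out exactly on three items.
cancel = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
print(friedman_statistic(agree), friedman_statistic(cancel))
```

Complete agreement gives the maximum value $m(I-1)$, while rank sums that are all equal give zero. Random rankings are rejected when the statistic exceeds the upper Chi-squared point with $I-1$ degrees of freedom.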


(ii) Tests for multivariate normality

Andrews (1972) has proposed tests for assessing multivariate
normality for $p$-dimensional data. Suppose the data
are transformed so as to have zero means and identity
covariance matrix. Then using the probability transform
we have points in a hypercube. The nearest distance statistic
is given by

$$d(X_i, X_j) = \max_k \min\{|x_{ki} - x_{kj}|,\; 1 - |x_{ki} - x_{kj}|\}.$$

The volume of the set enclosed by a distance $d$ from the point
$X_i$ is

$$V(d) = (2d)^p .$$

Since the $X_i$ are uniformly distributed, $V(d)$ has an
exponential distribution with a parameter, say $\lambda$. Then the
conditional probability of $V(d) \le V(d_i)$ given that $d < d_0$ is

$$p(d_i) = \frac{1 - \exp\{-\lambda V(d_i)\}}{1 - \exp\{-\lambda V(d_0)\}} .$$

Let

$$w_i = \Phi^{-1}(p(d_i)).$$

If the $w_i$ are calculated from disjoint parts of the unit
hypercube, they should not show any dependence on the center,
$x_i$, of the part of the cube. This dependence may be tested by
using quadratic regression of $w_i$ on $x_i$. The regression sum
of squares has a Chi-squared distribution with $(p+1)(p+2)/2$
degrees of freedom. For further details, the reader is referred
to Gnanadesikan (1977, p. 169), wherein other relevant tests are
also given.

Graphical tests, as generalizations of univariate plots on
probability paper to assess univariate normality, are based on
radii and angles.

Let $\theta_i$ be the angle which $z_i$ makes with the abscissa axis
in the bivariate normal case. The ordered $\theta_i$ are plotted
against the corresponding quantiles of the uniform distribution
on $(0, 2\pi)$. If there is bivariate normality in the data, both
of these plots, for radii and for angles, should be linear. In the
case of $p$ dimensions, there would be $p-1$ angles. For one of the
angles, a probability plot is still appropriate since it is still
uniformly distributed on $(0, 2\pi)$. For the remaining $p-2$ angles,
the distributions are proportional to $\sin^{p-1-j}\theta_j$,
$j = 1, 2, \ldots, p-2$. For these angles, the appropriate plots are
obtained by plotting the $n$ ordered values against $n$ quantiles of
this distribution.

(iii) Tests of randomness using stochastic processes

Liebetrau (1979) has studied some statistics utilized
in tests of randomness based on the variance-time curve of the
Poisson process. Consider the following notation.

$\{T_j\}$ = real weakly stationary point process,

$N(A)$ = the number of $\{T_j\}$ in $A$ for Borel set $A$.

The mean and variance of $N(A)$ are given by

$$M(A) = E(N(A)),$$

$$V(A) = E[N(A) - M(A)]^2 .$$

When $A$ is an interval $(x, x+t)$, the mean and variance of the
process are denoted by

$$M(A) = M\{(x, x+t)\} = M(t),$$

$$V(A) = V(t).$$

Suppose $T_1, T_2, \ldots, T_n$ have been observed in $(0, T)$. $V(t)$ is
estimated by

$$\hat{V}(t) = \frac{1}{T} \int_0^T \left[N(x, x+t) - \frac{nt}{T}\right]^2 dx .$$

If $\{T_j\}$ is a Poisson process with rate $\rho$ and $V(u) = \rho u$,
then Liebetrau (1976) showed that

$$\eta_T(t) = T^{\beta}\,[\hat{V}(tT^{\gamma}) - V(tT^{\gamma})],
\qquad \beta = \tfrac{1}{2}(1 - 3\gamma), \quad 0 < \gamma < 1,$$

converges weakly as $T \to \infty$ to a Gaussian process with covariance
$K(t,u) = \tfrac{2}{3}\rho^2(3t^2u - t^3)$, $0 \le t \le u \le 1$. Similarly
$\eta_T^{c}(t) = T^{1/2}[\hat{V}(tc) - V(tc)]$ converges weakly as
$T \to \infty$ to $\eta^{c}$, which is Gaussian with covariance

$$K^{c}(t,u) = c^3 K(t,u).$$

Tests of randomness are based on

$$C_1 = \int_0^1 \eta^{c}(t)\,dt,$$

$$C_2 = \int_0^1 (\eta^{c}(t))^2\,dt .$$

Upper percentage points are given by Liebetrau (1979,
p. 38).

9. Applications

Tests of randomness are routinely applied in many areas
of science and engineering. A few recent examples from
lottery games are discussed and examples are given from
evolutionary paleontology and mining disasters.

9.1 Evolutionary Paleontology

Several  tests  of  randomness  have  been  applied  in 
studying  the  pattern  of  evolution  in  the  fossil  record  which 
is  basic  to  the  study  of  evolutionary  paleontology.  The  model  of 
progressive  specialization  through  the  phanerozoic  is  studied 
using  taxonomic  and  morphological  evidence.  In  a  recent 
paper,  Flessa  and  Levinton  (1975)  have  used  tests  of  goodness 
of  fit  of  the  Poisson  distribution  and  also  the  run  test  for 
studying  randomness.  Using  data  obtained  from  patterns  of 
origination  within  taxa  and  patterns  of  dominance  diversity, 
they  have  distinguished  random  and  non-random  components. 

9.2 Mining Disasters

Jarrett  (1979)  has  given  a  revised  table  of  time 
intervals  in  days  between  explosions  in  mines  in  England 
during  1851  -  1962.  These  data  were  originally  given  by 
Maguire,  Pearson  and  Wynn  (1952).  Table  1  of  Jarrett  (1979) 
is  reproduced  here  as  Table  III.  One  of  the  earliest  problems 
studied  for  the  data,  has  been  the  test  of  hypothesis  of  the 
randomness  of  the  occurrence  of  coal  mining  disasters. 

Assuming  that  the  mining  disasters  occur  at  random, 
the  time  interval  between  them  has  a  Gamma  distribution. 

Since  the  data  given  are  in  terms  of  these  time  intervals, 
the  test  of  randomness  is  carried  through  the  test  of  goodness 
of  fit  of  the  Gamma  distribution. 

Table III

Time intervals in days between explosions in mines,
March 15, 1851 to March 22, 1962 (to be read down columns)

   157     65     53     93    127    176     22   1205   1643    312
   123    186     17     24    218     55     61    644     54    536
     2     23    538     91      2     93     78    467    326    145
   124     92    187    143      0     59     99    871   1312     75
    12    197     34     16    378    315    326     48    348    364
     4    431    101     27     36     59    275    123    745     37
    10     16     41    144     15     61     54    456    217     19
   216    154    139     45     31      1    217    498    120    156
    80     95     42      6    215     13    113     49    275     47
    12     25      1    208     11    189     32    131     20    129
    33     19    250     29    137    345    388    182     66   1630
    66     78     80    112      4     20    151    255    292     29
   232    202      3     43     15     81    361    194      4    217
   826     36    324    193     72    286    312    224    368      7
    40    110     56    134     96    114    354    566    307     18
    12    276     31    420    124    108    307    462    336   1358
    29     16     96     95     50    188    275    228     19   2366
   190     88     70    125    120    233     78    806    329    952
    97    225     41     34    203     28     17    517    330    632

Using the density of the Gamma as

$$f(x) = \begin{cases}
\dfrac{1}{\Gamma(\alpha)\beta^{\alpha}}\, e^{-x/\beta}\, x^{\alpha-1}, & x > 0, \\
0, & \text{elsewhere,}
\end{cases}$$

the maximum likelihood estimates of $\alpha$ and $\beta$ are obtained by
solving the following equations

$$\hat{\alpha}\hat{\beta} = \bar{X},$$

$$\log \hat{\alpha} - \frac{\Gamma'(\hat{\alpha})}{\Gamma(\hat{\alpha})}
= \log \bar{X} - \overline{\log X} .$$

Omitting 0 as an observation, we use $n = 189$ observations
with $\bar{X} = 214.53$ and $\overline{\log X} = 4.556$, and use the tables of
Pearson and Hartley (1966) to obtain

$$\hat{\alpha} = 0.7384$$

and

$$\hat{\beta} = 290.53 .$$

Using the Kolmogorov-Smirnov statistic, we accept the
hypothesis at the .05 level of significance.
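The two likelihood equations can also be solved numerically instead of by tables. The sketch below is a hypothetical illustration: it takes the mean of the logarithms as 4.556 and the sample mean as the product of the reported estimates, 0.7384 x 290.53 (about 214.53), as required by the first equation. The digamma function is approximated by a recurrence followed by an asymptotic series, and the second equation is solved by bisection. None of the function names come from the paper.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift the argument upward until the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 * inv - series


def gamma_mle(mean_x, mean_log_x):
    """Solve log(a) - digamma(a) = log(mean) - mean(log); then b = mean / a."""
    target = math.log(mean_x) - mean_log_x
    low, high = 1e-3, 1e3
    for _ in range(200):  # log(a) - digamma(a) is decreasing in a
        mid = 0.5 * (low + high)
        if math.log(mid) - digamma(mid) > target:
            low = mid
        else:
            high = mid
    alpha = 0.5 * (low + high)
    return alpha, mean_x / alpha


alpha_hat, beta_hat = gamma_mle(0.7384 * 290.53, 4.556)
print(round(alpha_hat, 3), round(beta_hat, 1))
```

An estimated shape parameter below 1 indicates intervals somewhat more variable than the exponential case, which corresponds to a shape parameter of exactly 1.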


9.3 Occurrence of Prizes in a State Lottery Game

In legalized lottery games, presently being played in
many states in America and various countries throughout
the world, the games are designed in such a way that the
prizes occur at random with preassigned probabilities. One
of the main problems in these games is to keep control of
the integrity and honesty of the game. When several persons
receive prizes in the same locality or several prizes occur
within a short interval of time during the progress of the
game, the randomness of the process is likely to be challenged.

Most often, tests of randomness are performed to check whether
the prize structure in the game is random. Consider the
situation that the probability of getting a prize is 0.01.
Then the hypothesis of randomness means that in the independent
trials resulting in losers and winners, the winners occur
with probability 0.01. The test statistic in this case can
be obtained from the goodness of fit test for the geometric
distribution with probability p = 0.01. The data given in
the following were obtained from a game designed for a state
lottery. The frequency distribution of the number of winning
tickets in a given sequence of 1500 tickets generated is given
in Table IV. The number of losers between successive winners
is called the waiting time. Table V provides the list of 180
waiting times for certain large tier prizes in another state
lottery game. The test for randomness in this case is the usual
Chi-squared test for goodness of fit. The Chi-squared statistic
is 4.98 in this case and the comparison is made with the
tabulated Chi-squared value of 9.24 with 5 degrees of freedom
at the 10% level of significance. Hence we say that the prizes
occur at random.

It can be seen from the following straightforward
argument that the geometric distribution is the discrete
analog of the exponential distribution. Consider the

Table IV

Frequency of occurrence of prizes

Number of           Observed     Expected
winning tickets     frequency    frequency

1                      51           54
2                      40           43
3                      27           35
4-5                    51           50
6-7                    37           32
8-11                   35           34
12 or more             32           25

Total                 273          273
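The goodness-of-fit statistic for Table IV can be recomputed directly from the tabulated frequencies. The short sketch below uses the observed and expected counts as printed in the table. Because the printed expected frequencies are rounded to integers, the recomputed value (about 5.0) agrees with the reported 4.98 only approximately. The function name is hypothetical.

```python
# Observed and expected frequencies from Table IV (classes 1, 2, 3, 4-5,
# 6-7, 8-11, 12 or more).
observed = [51, 40, 27, 51, 37, 35, 32]
expected = [54, 43, 35, 50, 32, 34, 25]


def chi_squared(observed_counts, expected_counts):
    """Pearson's Chi-squared statistic for binned counts."""
    return sum((o - e) ** 2 / e
               for o, e in zip(observed_counts, expected_counts))


statistic = chi_squared(observed, expected)
critical_value = 9.24  # tabulated 10% point, 5 degrees of freedom
print(round(statistic, 2), statistic < critical_value)
```

Since the statistic falls well below the 10% critical value, the calculation reproduces the conclusion in the text that the hypothesis of randomness is not rejected.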


Table V

Waiting times for high tier denomination prizes in a state
lottery game (given in multiples of 300 and arranged in
increasing order in every column)

    33      5      1      6     13     20      4      5      5      8
    35     15      7     24     32     21      7      6     15     18
    47     18      9     28     34     43     17     25     17     22
    49     29     26     29     43     43     24     29     27     26
    60     29     40     42     47     62     37     38     57     47
    63     38     42     45     65     68     42     40     68     52
    72     55     50     55     66     75     60     46     70     59
    77     58     55     68     71     75     79     60     89     84
    86     77     72     70     74     77     80     63    106     88
    91     92     93     77     80     80     88     70    108    121
   106    112     94     81     94     89     96     79    113    137
   120    120    105     87     98    115    102     94    113    140
   126    122    119    109    130    116    156    118    117    141
   132    172    153    119    134    155    189    129    147    143
   154    195    164    190    163    162    206    134    148    162
   160    195    166    232    163    219    210    151    212    198
   235    253    282    266    191    227    219    259    219    216
   274    335    442    392    422    273    307    574    289    258

$\bar{X} = 106.67$,  $n = 180$



exponential distribution with parameter $\lambda$. The discretized
probability between $[x] - 1$ and $[x]$ is

$$\int_{[x]-1}^{[x]} \lambda e^{-\lambda t}\,dt
= e^{-\lambda([x]-1)}\left[1 - e^{-\lambda}\right].$$

Let $p = 1 - e^{-\lambda}$; then we have, for $[x] = y$ a positive integer,
$P(Y = y) = p(1-p)^{y-1}$, which is the geometric distribution.
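The correspondence between the discretized exponential and the geometric distribution is easy to confirm numerically. In the sketch below the rate 0.3 is an arbitrary illustrative value and the function names are hypothetical. The discretized probability is the exponential mass on the unit interval ending at $y$, and it is compared with $p(1-p)^{y-1}$ for $p = 1 - e^{-\lambda}$.

```python
import math


def discretized_exponential(y, rate):
    """Exponential probability mass on the interval (y - 1, y]."""
    return math.exp(-rate * (y - 1)) - math.exp(-rate * y)


def geometric_pmf(y, p):
    """Probability that the first success occurs on trial y."""
    return p * (1.0 - p) ** (y - 1)


rate = 0.3                 # arbitrary illustrative value
p = 1.0 - math.exp(-rate)  # matching success probability
for y in (1, 2, 5, 10):
    print(y, discretized_exponential(y, rate), geometric_pmf(y, p))
```

For a prize probability of 0.01 the matching rate is $-\log(0.99)$, roughly 0.01005, so for rare prizes the exponential model with mean close to $1/p$ is a close continuous stand-in for the geometric waiting time.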

We test the hypothesis that the waiting times are
exponentially distributed since the continuous analog of the
geometric distribution is the exponential distribution. The
frequency distribution is given in Table VI for the waiting
times in Table V. The Chi-squared value for the table
under the exponential model is 6.68 and we again accept
the hypothesis of randomness at the 10 percent level of
significance, as the tabulated Chi-squared value is 7.78 for
4 degrees of freedom.

9.4 Randomness of Digits in a Daily Lottery Numbers Game

The tests of randomness for numbers generated for
lottery games as well as for awarding prizes are made in
the same manner. Consider the following sequence of
three-digit random numbers in a state lottery "Number Game" for
100 drawings as given in Table VII. To obtain the frequency
test for the random digits we form Table VIII giving the
distribution of 300 digits in 100 numbers. Testing for
uniformity confirms the hypothesis of randomness at the
10 percent level of significance.


Table VI

Frequency distribution of waiting times

Interval        Observed     Expected
                frequency    frequency

0 - 60             64           77
61 - 120           59           45
121 - 180          27           29
181 - 240          16           12
241 - 300           8           10
301 or more         6            7

Total             180          180



Table VII

Drawings in a number game in a state lottery

308   967   521   492   407   646   514   559   458   145
554   991   751   259   730   804   657   432   972   407
098   109   98?   261   748   130   743   551   167   682
037   691   717   002   688   709   146   544   706   909
089   503   163   75?   710   613   340   081   114   036
876   758   972   580   738   519   123   568   854   760
810   351   742   392   810   892   983   988   415   460
392   623   533   743   454   726   190   714   750   407
516   953   024   253   107   953   080   035   988   798
969   547   158   472   216